Nomogram for predicting invasive lung adenocarcinoma in small solitary pulmonary nodules

Background: This study aimed to construct a clinical prediction model and nomogram to differentiate invasive from non-invasive pulmonary adenocarcinoma in solitary pulmonary nodules (SPNs).
Methods: We analyzed computed tomography and clinical features as well as preoperative biomarkers in 1,106 patients with SPN who underwent pulmonary resection with definite pathology at Qilu Hospital of Shandong University between January 2020 and December 2021. Clinical parameters and imaging characteristics were analyzed using univariate and multivariate logistic regression analyses. Predictive models and nomograms were developed, and their discriminative abilities were evaluated using receiver operating characteristic (ROC) curves. The clinical utility of the nomogram was evaluated using decision curve analysis (DCA).
Results: The final regression analysis selected age, carcinoembryonic antigen, bronchus sign, lobulation, pleural adhesion, maximum diameter, and the consolidation-to-tumor ratio as associated factors. The areas under the ROC curves were 0.844 (95% confidence interval [CI], 0.817–0.871) and 0.812 (95% CI, 0.766–0.857) for patients in the training and validation cohorts, respectively. The calibration curves of the predictive model revealed good calibration for both cohorts. The DCA results indicated that the clinical prediction model was useful in clinical practice. Bias-corrected C-indices for the training and validation cohorts were 0.844 and 0.814, respectively.
Conclusion: Our predictive model and nomogram might be useful for guiding clinical decisions regarding personalized surgical intervention and treatment options.

A recently proposed pathological classification categorizes lung adenocarcinoma as pre-invasive pulmonary adenocarcinoma (pre-IPA) and invasive pulmonary adenocarcinoma (IPA). Pre-IPA lesions comprise atypical adenomatous hyperplasia (AAH), adenocarcinoma in situ (AIS), and minimally invasive adenocarcinoma (MIA) (14,15). Clinical treatment tends to differ between pre-IPA and IPA; sublobar resection might be reasonable for pre-IPA lesions because the 5-year survival rate after complete resection is ~100%, whereas standard lobectomy with lymph node dissection might be suitable for IPA (15,16). However, distinguishing pre-IPA from IPA lesions is difficult in the absence of complete preoperative histological sampling, which limits optimal treatment planning (17). Therefore, an effective preoperative risk prediction model is needed to predict IPA risk.
Numerous prediction models have been developed to date for SPN diagnosis, including the well-known Mayo model, the Brock University model, the Peking University People's Hospital (PKUPH) model, the VA model, and others. These models have reported diagnostic accuracies above 80%. However, each model has its own limitations and requires further optimization.
A nomogram is a reliable tool for creating simple visual graphs of statistical predictive models to quantify the risk of clinical events such as cancer (18,19). The high incidence of lung adenocarcinoma prompted us to develop a risk prediction model to differentiate IPA from pre-IPA in patients with SPNs and to establish a nomogram combining CT and clinical features that estimates IPA risk and supports clinicians' treatment recommendations.

Patient selection
The Ethics Committee of Qilu Hospital, Shandong University approved this single-center study (registration number: KYLL-202008-023-1) and waived the need for written informed consent due to its retrospective design. All procedures complied with the principles enshrined in the Declaration of Helsinki (2013 amendment).
This study included patients with small SPNs with clear pathology who underwent minimally invasive pulmonary resection between January 2020 and December 2021 at the Department of Thoracic Surgery, Qilu Hospital, Shandong University. Inclusion criteria comprised a single intrapulmonary nodule identified on chest CT within 1 month before surgery, SPN diameter ≤ 20 mm, absence of pulmonary atelectasis and active lung inflammation, surgical resection yielding definitive pathological findings, no symptoms at diagnosis, and no preoperative treatment. Exclusion criteria comprised age < 18 years, open thoracic surgery, incomplete perioperative data, history of malignant disease within 5 years, and metastatic tumors. All patients who met the criteria were randomly assigned, using a random split-sample method, to training and validation cohorts in a 7:3 ratio to develop the prediction nomogram and verify its performance, respectively.
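A random 7:3 split-sample assignment of this kind can be sketched as follows. This is a minimal illustration in Python, not the authors' procedure: the function name `split_cohort`, the seed value, and the use of consecutive integers as patient identifiers are all assumptions made for the example.

```python
import random

def split_cohort(patient_ids, train_fraction=0.7, seed=2021):
    """Shuffle patient IDs reproducibly and split them into two cohorts.

    The seed is an arbitrary assumption; the paper does not report one.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)   # local generator, so global state is untouched
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# Hypothetical identifiers 1..1106 standing in for the enrolled patients.
train_ids, validation_ids = split_cohort(range(1, 1107))
print(len(train_ids), len(validation_ids))
```

With 1,106 identifiers this simple rounding rule yields 774 and 332 patients. The cohort sizes reported in the paper (776 and 330) differ slightly, which is expected because the exact counts depend on how the split is implemented.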
All scans were performed with contrast enhancement (Iopromide injection 300) from the base to the apex of the lung using either a 64-slice multi-detector CT scanner (Aquilion 64; Toshiba Medical Systems) or a 16-slice multi-detector CT scanner (Somatom Definition AS; Siemens Healthcare, Erlangen, Germany). Images of the entire chest were acquired with patients lying supine and holding their breath at the end of deep inspiration. The scanning parameters were 120 kVp, 50 mA, 1 mm collimation, and a pitch of 1.5:1. The data were reconstructed using filtered back projection with a 2 mm slice thickness, a 2 mm increment, and a smooth convolution kernel (Siemens B30f or Toshiba FC02).
Two radiologists with > 5 years of experience in chest radiology independently assessed each imaging feature, and a third with > 20 years of experience in chest radiology reassessed discrepancies. Disagreements were resolved by consensus.
The imaging features were defined as follows. Central location was defined as nodules located in the bronchi, lobar bronchi, or segmental bronchi. Peripheral location was defined as nodules found below the tertiary bronchus. Spiculation was defined as strands extending from the nodule margins into the lung parenchyma without contacting the pleural surface. Calcification on CT images was categorized as stratified, central nodular, bronchial, diffuse, or popcorn-like. Cavitation was defined as gas-filled spaces appearing as lucent or low-attenuation regions. Vascular penetration was recorded when a pulmonary artery crossed the nodule. Pleural adhesion was defined as linear attenuation extending from the SPN toward the pleura or toward a major or minor fissure. The bronchus sign indicated direct bronchial involvement of the nodule. Lobulation was defined as a wavy or scalloped portion of the lesion surface, with strands extending from the nodule margins into the lung parenchyma. Pleural effusion was defined as blunting of the costophrenic angle. Mediastinal lymph node enlargement was noted. The consolidation-to-tumor ratio (CTR) is the ratio of the diameter of the solid component of a lung nodule to its maximum diameter.
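The CTR definition above reduces to a single division. The sketch below is illustrative only: the function name, the millimetre units, and the example measurements are assumptions, not values from the study.

```python
def consolidation_to_tumor_ratio(solid_diameter_mm, max_diameter_mm):
    """Return the CTR: solid-component diameter divided by maximum nodule diameter."""
    if max_diameter_mm <= 0:
        raise ValueError("maximum diameter must be positive")
    if not 0 <= solid_diameter_mm <= max_diameter_mm:
        raise ValueError("solid diameter must lie between 0 and the maximum diameter")
    return solid_diameter_mm / max_diameter_mm

# Hypothetical nodules: part-solid, pure ground-glass, and fully solid.
part_solid = consolidation_to_tumor_ratio(6.0, 15.0)
pure_ground_glass = consolidation_to_tumor_ratio(0.0, 12.0)
fully_solid = consolidation_to_tumor_ratio(10.0, 10.0)
print(part_solid, pure_ground_glass, fully_solid)
```

By this definition the CTR ranges from 0 (no solid component) to 1 (entirely solid), so both diameters must be measured in the same unit.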
All pathological specimens were fixed in formalin, stained with hematoxylin and eosin, and histologically evaluated by two experienced lung pathologists using a light microscope. All specimens were categorized according to the International Association for the Study of Lung Cancer/American Thoracic Society/European Respiratory Society classification of lung adenocarcinoma (20).
We assigned patients with SPN diameters ≤ 2 cm to pre-IPA and IPA groups. The pre-IPA group included patients with AAH, AIS, MIA, and benign lesions.

Statistical analysis
All data were statistically analyzed using SPSS (version 26.0; IBM Corp., Armonk, NY, USA). Normally distributed continuous variables are expressed as means ± standard deviation (SD) and were compared using Student t-tests. Non-normally distributed continuous variables are expressed as medians with interquartile ranges (IQRs) and were compared between two groups using Mann-Whitney U tests. Categorical variables were compared using Pearson chi-square or Fisher exact tests. Statistical significance was defined as P < 0.05. All risk factors affecting the probability of IPA in the training cohort were evaluated using univariate analysis, and all those with P < 0.05 were then included in multivariate logistic regression analysis using R statistical software (Windows version 4.2.1, http://www.r-project.org/). A predictive model for SPN was constructed based on the results of the multivariate logistic regression analysis. The area under the receiver operating characteristic (ROC) curve (AUC) was determined. Scores for each variable were calculated using the regression model, and the predicted probability of IPA was derived by adding the scores for each variable. Nomograms were built and calibration curves were generated using the regression modeling strategies (rms) package in R. The ROC curves were plotted using the pROC package in R.
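The AUC can be read as the probability that a randomly chosen IPA case receives a higher predicted probability than a randomly chosen pre-IPA case, with ties counted as one half. The sketch below computes it that way in plain Python. It is not the authors' R code, and the labels and scores are toy values invented for the example.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for IPA, 0 for pre-IPA; scores: predicted probabilities.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy data, invented for illustration only.
toy_labels = [0, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
toy_auc = auc_from_scores(toy_labels, toy_scores)
print(round(toy_auc, 3))
```

Here 8 of the 9 positive-negative pairs are ordered correctly, so the toy AUC is 8/9. The pairwise loop is quadratic in the sample size, which is fine for a sketch but slower than the rank-based methods used by statistical packages.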

Nomogram performance
The performance of the predictive nomogram was assessed based on discriminatory power, calibration, and clinical utility. The ability of a model to correctly distinguish between events and non-events is called discrimination. We evaluated the discriminative ability of the predictive nomograms using ROC curves (21). Calibration measures the extent to which the predicted probabilities match the actual results. We assessed calibration using Hosmer-Lemeshow tests, with P > 0.05 indicating satisfactory calibration (22). A calibration plot was created to further evaluate calibration. Internal validation proceeded by bootstrap resampling 1,000 times (23). The clinical effectiveness of the predictive nomograms was evaluated based on the net benefit at different threshold probabilities using decision curve analysis (24). The optimal cutoff value was the point at which the Youden index (sensitivity + specificity − 1) was maximal, based on the results of ROC curve analyses of the training cohort.
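Choosing a cutoff by the Youden index amounts to scanning candidate thresholds and keeping the one that maximizes sensitivity + specificity − 1. The sketch below shows one way to do this. The function name, the convention that a score at or above the cutoff counts as predicted IPA, and the toy data are assumptions for illustration.

```python
def youden_optimal_cutoff(labels, scores):
    """Return (cutoff, youden_index, sensitivity, specificity) maximizing Youden's J.

    labels: 1 for IPA, 0 for pre-IPA. A score >= cutoff is treated as predicted IPA.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    best = None
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
        true_neg = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cutoff)
        sensitivity = true_pos / n_pos
        specificity = true_neg / n_neg
        j = sensitivity + specificity - 1.0
        if best is None or j > best[1]:  # on ties, the lowest cutoff is kept
            best = (cutoff, j, sensitivity, specificity)
    return best

# Toy data, invented for illustration only.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.5, 0.7, 0.9]
cutoff, j, sens, spec = youden_optimal_cutoff(toy_labels, toy_scores)
print(cutoff, j, sens, spec)
```

In this toy example the best cutoff is 0.4, with sensitivity 1.0, specificity 0.75, and a Youden index of 0.75. The Youden index weights sensitivity and specificity equally, so a cutoff chosen this way may not suit settings where missing an invasive lesion is costlier than overtreating a non-invasive one.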

Characteristics of the patients
A total of 2,213 patients who had surgery at our institution between January 2020 and December 2021 were initially considered for this study. These patients were consecutive and were not preselected. Following a series of screening steps, 1,106 eligible patients were eventually enrolled. Figure 1 shows the process of identifying and selecting the 1,106 eligible patients, among whom 163, 188, 233, and 522 had benign nodules and AAH, AIS, MIA, and IPA, respectively. All patients were assigned to pre-IPA (n = 584) and IPA (n = 522) groups based on nodule invasiveness. The patients were then randomly assigned to training (n = 776) or validation (n = 330) cohorts at a 7:3 ratio. No variables differed significantly between the cohorts (Table 1). The training cohort comprised 406 and 370 patients with pre-IPA and IPA nodules, and the validation cohort comprised 178 and 152 patients with pre-IPA and IPA nodules, respectively. Table 2 shows the characteristics of the patients in the training and validation groups.
Here, e is the base of the natural logarithm (e ≈ 2.718281828) and x is the linear predictor computed from the logistic regression coefficients. The units of maximum diameter, age, and CEA are cm, years, and ng/mL, respectively.
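The definitions of e and x above are consistent with the standard logistic form, in which the predicted probability is e^x / (1 + e^x) and x is the intercept plus the sum of each coefficient multiplied by its predictor. The equation itself is not reproduced in this text, so the sketch below assumes that standard form. Every numeric coefficient in it is a placeholder invented for illustration and is not taken from the paper; the fitted values are those reported in Table 4.

```python
import math

def linear_predictor(intercept, coefficients, values):
    """Intercept plus the sum of coefficient * predictor value."""
    missing = set(coefficients) - set(values)
    if missing:
        raise ValueError(f"missing predictor values: {sorted(missing)}")
    return intercept + sum(coefficients[name] * values[name] for name in coefficients)

def ipa_probability(x):
    """Standard logistic transform of the linear predictor x."""
    return math.exp(x) / (1.0 + math.exp(x))

# PLACEHOLDER values only: these are NOT the coefficients fitted in the study.
PLACEHOLDER_INTERCEPT = -1.0
PLACEHOLDER_COEFFICIENTS = {
    "age_years": 0.01,
    "cea_ng_per_ml": 0.1,
    "bronchus_sign": 0.5,      # 1 = present, 0 = absent
    "lobulation": 0.5,         # 1 = present, 0 = absent
    "pleural_adhesion": 0.5,   # 1 = present, 0 = absent
    "max_diameter_cm": 1.0,
    "ctr": 1.0,
}

# Hypothetical patient, also invented for the example.
example_patient = {
    "age_years": 60,
    "cea_ng_per_ml": 2.0,
    "bronchus_sign": 0,
    "lobulation": 1,
    "pleural_adhesion": 0,
    "max_diameter_cm": 1.5,
    "ctr": 0.5,
}

x = linear_predictor(PLACEHOLDER_INTERCEPT, PLACEHOLDER_COEFFICIENTS, example_patient)
p = ipa_probability(x)
print(f"x = {x:.2f}, predicted probability = {p:.3f}")
```

The seven predictor names mirror the variables retained in the final model, and the units follow the sentence above (cm, years, ng/mL). Because the coefficients are placeholders, the printed probability has no clinical meaning.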
Based on the coefficients of the multivariate logistic regression model, a nomogram predicting IPA in SPNs ≤ 2 cm was drawn using the rms package in R (Figure 3). This nomogram comprised 10 axes, of which axes 2-8 represent the seven variables in the prediction model. The score for each risk factor was obtained by drawing a vertical line from its value to the points axis at the top, and the scores were summed to obtain the total score. The total points axis was then used to predict the preoperative probability of IPA in SPNs measuring ≤ 2 cm. An appropriate surgical method could then be selected.

Predictive performance and validation of the nomogram
We assessed the discriminative power of the prediction model and nomogram using ROC curves. The AUCs were 0.844 (95% CI, 0.817-0.871) and 0.812 (95% CI, 0.766-0.857) for the training and validation cohorts, respectively, indicating that the predictive accuracy of the nomogram was relatively good (Figure 4). Nevertheless, overfitting might have contributed to the high AUC values. The ROC curve cutoff value for the training cohort was 0.432, with sensitivity and specificity of 0.803 and 0.724, respectively (Figure 5; Table 5).
Calibration was evaluated using Hosmer-Lemeshow tests and calibration plots. The P values in the Hosmer-Lemeshow test were 0.068 and 0.290 in the training and validation cohorts, respectively, indicating no significant differences between the predicted and actual probabilities. Good calibration of the predictive nomogram was also supported by the calibration plots.

FIGURE 3 Nomogram to predict probability of IPA for SPN ≤ 2 cm. CEA, carcinoembryonic antigen; CTR, consolidation-to-tumor ratio.

Clinical utility of the predictive nomogram
We assessed the clinical utility of the nomograms using decision curve analysis. The nomograms provided greater net benefit across a broad range of threshold probabilities for predicting the risk of IPA in SPNs ≤ 2 cm in diameter in the training and validation cohorts (Figures 7A, B), indicating that the nomograms were clinically useful. We also created clinical impact curves (Figure 8) to help surgeons make better clinical decisions.
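Decision curve analysis compares the net benefit of a model with two default strategies: treating every nodule as IPA and treating none as IPA. At a threshold probability pt, the standard net benefit is TP/n − (FP/n) × pt/(1 − pt). The sketch below implements that formula in plain Python. It is not the authors' R code, and the toy labels and scores are invented for illustration.

```python
def net_benefit(labels, scores, threshold):
    """Net benefit of calling a nodule IPA when its score >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(labels)
    true_pos = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    false_pos = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)  # weight given to a false positive
    return true_pos / n - (false_pos / n) * odds

def net_benefit_treat_all(labels, threshold):
    """Net benefit of assuming every nodule is IPA."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Toy data, invented for illustration only (1 = IPA, 0 = pre-IPA).
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.5, 0.7, 0.9]
nb_model = net_benefit(toy_labels, toy_scores, threshold=0.5)
nb_all = net_benefit_treat_all(toy_labels, threshold=0.5)
nb_none = 0.0  # assuming no nodule is IPA yields zero net benefit by definition
print(nb_model, nb_all, nb_none)
```

In this toy example the model's net benefit at a threshold of 0.5 is 0.25, compared with 0 for both default strategies. A decision curve repeats this calculation over a range of thresholds and plots the three resulting lines.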

Discussion
Optimal management of patients with SPN is a growing clinical concern. Assessing whether persistent SPNs are pathologically invasive is important because clinical management strategies differ between pre-IPA and IPA lesions. We developed a clinical prediction model and visual diagnostic nomogram for individualized preoperative prediction of IPA in SPNs with diameters ≤ 2 cm by retrospectively analyzing the hematological indices, imaging characteristics, and general clinical information of 776 patients in the training cohort. We identified age, CEA values, bronchus sign, lobulation, pleural adhesion, maximum tumor diameter, and CTR as independent predictors of IPA. Our nomogram predicted patient-specific IPA probability with good discrimination and calibration.
Age is an important clinical factor. The capacity of cells to renew and repair epithelial damage caused by carcinogens decreases, whereas tumor malignancy increases with advancing age (25-27). Although we found that age correlated with IPA, it was the least influential factor.
Carcinoembryonic antigen is a polysaccharide-protein complex involved in cell adhesion that is usually absent or minimal in healthy adult blood, and it might be linked to a poor tumor prognosis (28). Elevated serum CEA levels are significant predictive markers of early relapse (29), progression (30), and treatment outcomes. Our findings showed that CEA can predict IPA in SPNs, which is consistent with these previous studies.
Lobulation is more prevalent in invasive than in pre-invasive lesions (31). Bronchial changes can predict IPA (32). These morphological features are associated with active fibroblast proliferation in adenocarcinomas and are caused by fibrous tissue contraction (33). This has been confirmed by others, suggesting that activated fibroblast proliferation in adenocarcinoma is associated with aggressive tumor growth (34). In addition, the lack of significance of spiculation here might be attributable to its low frequency. Subpleural nodules, tumors in contact with the visceral pleura, or linear opacities that run perpendicular to and intersect the visceral pleura might result in pleural adhesion (35). Pleural adhesions are associated with tumor invasiveness and a poor prognosis (36-38).
Our findings suggested that lobulation, the bronchus sign, and pleural adhesion are more likely to be features of invasive lung adenocarcinoma. Nodule size increases as lung adenocarcinoma becomes more invasive (39,40). Our findings confirmed this. Moreover, the maximum nodule diameter was the most influential factor for IPA in the present study.
The CTR is an imaging feature of small lung adenocarcinomas and is the ratio of the diameter of the solid component to that of the total tumor (41-43). It is an established radiological parameter used to identify pathologically noninvasive tumors on CT images (43,44). We found that the CTR positively correlated with IPA. Thus, a higher proportion of solid components is associated with more invasive SPNs.
We used data from Qilu Hospital to develop and validate a new predictive model and clinical prediction nomogram that can help thoracic surgeons use preoperative information to assess risk of IPA in patients with SPNs. Patients with high scores underwent curative lobectomy, whereas those with low scores underwent sublobar resection. Consequently, modeling to distinguish between IPA and pre-IPA in patients with SPNs can improve their management and prognosis.
FIGURE 7 Decision curve analysis of the predictive nomogram in training (A) and validation (B) cohorts. Y axis, net benefit. Black and grey lines, assumptions that all SPNs ≤ 2 cm in diameter are pre-IPA and that all SPNs ≤ 2 cm in diameter are IPA, respectively. Blue (A) and red (B) lines, training and validation cohorts, respectively.

Calibration curves of the prediction nomogram in training (A) and validation (B) cohorts. X and y axes respectively represent the probability predicted by the nomogram and the actual probability of an SPN ≤ 2 cm being IPA. Black dashed, blue solid, and red solid lines, ideal, apparent (uncorrected), and bias-corrected curves, respectively, obtained with the bootstrap method (B = 1,000 resamplings). SPN, solitary pulmonary nodule.

The PKUPH model was reported to outperform conventional models, whereas the Mayo model was the most often used model for predicting malignant SPN. The Brock model is a more precise prediction tool based on CT scans and descriptions of clinical data. Nevertheless, clinical indicators were not incorporated in these models. Prediction models developed abroad may not fit mainland Chinese populations well. Certain prediction models integrate more complex and quantitative imaging data into their evaluations, such as tumor diameter growth rates and CT attenuation. However, because these imaging data are difficult to obtain, measure, and standardize, they are rarely recognized and used by physicians. Unlike previous studies (45), we included benign tumors and combined them with the pre-IPA group. This grouping method is useful for predicting the prognosis of patients and has value in guiding clinical decisions because the possibility of benign tumors cannot be completely excluded in clinical SPNs. Moreover, we incorporated basic clinical patient information, imaging features, and hematological findings to establish a clinical prediction model with comprehensive preoperative information. The combination of preoperative clinical predictive results and rapid intraoperative pathological findings allows accurate and safe assessment of nodule invasiveness and the development of treatment strategies that are specific for individual patients. When preoperative predictive modeling suggested a high likelihood that the nodule was invasive, we performed a lobectomy. When it suggested that the nodule was likely non-invasive, we performed a sublobar resection to preserve the patient's lung function. Each patient therefore receives a customized diagnosis and course of care.
This study had several limitations. We included only patients who underwent surgical resection in our department. Those who did not undergo surgical resection were excluded, which introduces selection bias. The subjectivity of radiologists might have led to different judgments of the CT images of pulmonary nodules. Our model was limited by the retrospective design of the study. Our data were derived from a single center with a relatively small sample size. The predictive model has only been validated internally, and further validation involving multiple centers and sufficient samples is needed. Although the validation of the model showed good discrimination and calibration, the generalizability of the nomogram to new patient populations remains a major issue that external validation will need to address.

Conclusions
We developed and validated a novel and easy-to-use nomogram for predicting the risk of IPA in patients with SPN ≤ 2 cm in diameter. With its good discrimination and calibration, the nomogram may help clinicians and surgeons develop specific treatment strategies for each patient.

FIGURE 4 ROC curves of nomograms predicting IPA in training and validation groups.AUC, area under the ROC curve; ROC, receiver operating characteristic; SPN, solitary pulmonary nodule.

FIGURE 5 Complex ROC curves for nomograms predicting IPA of SPN ≤ 2 cm in the training cohort. AUC, area under the ROC curve; ROC, receiver operating characteristic; SPN, solitary pulmonary nodule.

TABLE 1
Characteristics of patients in training and validation cohorts.

TABLE 2
Clinical characteristics of patients with IPA and pre-IPA in the training and validation cohorts.

TABLE 3
Univariate and multivariate logistic regression analysis of IPA factors of SPN ≤2 cm in the training cohort.

TABLE 4
Details of the predictive model used to calculate the probability of IPA for SPN measuring ≤2 cm in diameter.

TABLE 5
Results of ROC curve for training cohort.